Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nat_gateway_enabled=false InvalidNatGatewayId.Malformed error #45

Closed

Conversation

paulrob-100
Copy link
Contributor

@paulrob-100 paulrob-100 commented Apr 19, 2021

what

  • Fix Setting the nat_gateway_enabled=false results in Invalid index error #44
  • Error reported is 'InvalidNatGatewayId.Malformed: Invalid id: "0" (expecting "nat-...")'
  • Make nat_gateway_enabled variable type explicit boolean
  • Remove dummy_az_ngw_ids and return empty map when disabled, which means private.tf nat_gateways_route_count is not dependent on the results of the apply
    • Tried using a local list with conditional comparison to the "0" constant but then dependent on the results of the apply
  • Added new fixture in eu-west-1 region and new test
  • terraform get -update fails when running tests in parallel hence disabled
  • Tested route tables with nat_gateway_enabled flag disabled and enabled (tests would need an extra output to test in code)

why

  • Users should be able to create subnets without NAT gateways

references

…private subnets aws_route resource

* Error is 'InvalidNatGatewayId.Malformed: Invalid id: "0" (expecting "nat-...")'
* Make nat_gateway_enabled variable type explicit boolean
* Remove dummy_az_ngw_ids and return empty map when disabled, which means private.tf nat_gateways_route_count is not dependent on the results of the apply
  * Tried using a local list with conditional comparison to the "0" constant but then dependent on the results of the apply
* Added new fixture in eu-west-1 region and new test
* terraform get -update fails when running tests in parallel hence disabled
* Tested route tables with nat_gateway_enabled flag disabled and enabled (tests would need an extra output to test in code)
@paulrob-100 paulrob-100 requested review from a team as code owners April 19, 2021 17:07
@paulrob-100 paulrob-100 requested review from dotCipher and woz5999 and removed request for a team April 19, 2021 17:07
@Gowiem
Copy link
Member

Gowiem commented Apr 19, 2021

/test all

Copy link
Member

@Gowiem Gowiem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulrob-100 this looks good to me and you obviously put a good bit of work into it so thank you. I'm going to call in a 2nd reviewer here since this isn't a trivial change. @Nuru you mind taking a look at this when you get the chance?

@Gowiem Gowiem requested a review from Nuru April 19, 2021 20:56
@paulrob-100
Copy link
Contributor Author

Thanks @Gowiem. I'm looking at the failing test to see if I can reproduce.

--- FAIL: TestExamplesCompleteNoNatGateway (48.00s)
    destroy.go:11: 
        	Error Trace:	destroy.go:11
        	            				examples_complete_test.go:126
        	Error:      	Received unexpected error:
        	            	FatalError{Underlying: error while running command: exit status 1; 
        	            	Error: Invalid count argument
        	            	
        	            	  on ../../private.tf line 96, in resource "aws_route" "default":
        	            	  96:   count = local.nat_gateways_route_count

@paulrob-100
Copy link
Contributor Author

I'm not able to reproduce the failing test locally @Gowiem.
I'm using 0.14.7, but there is some logic in the build harness which is selecting 0.13.6.
The test itself is passing, but then the destroy fails with the error. Perhaps a terraform bug?
Any ideas how to fix?

@paulrob-100
Copy link
Contributor Author

Design note:-
The local nat_gw_availability_zones_map introduced in public.tf will conflict with the alternative list of tuple implementation introduced in #47
https://github.com/cloudposse/terraform-aws-multi-az-subnets/pull/45/files#diff-cc32539d62c7383a06d2f5a9a3e22fdc92a499303657ead4f0ddda0eeb82516dR132

I think I prefer #47 implementation which means this local disappears in the merge.

@nitrocode
Copy link
Member

nitrocode commented Apr 21, 2021

Error: Invalid count argument

  on ../../private.tf line 96, in resource "aws_route" "default":
  96:   count = local.nat_gateways_route_count

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.
My attempt at fixing

It looks like the example that it's trying to apply depends on module.public_subnets.az_ngw_ids for the az_ngw_ids input which turns into

module "private_subnets" {
source = "../../"
availability_zones = var.availability_zones
vpc_id = module.vpc.vpc_id
cidr_block = local.private_cidr_block
type = "private"
# Map of AZ names to NAT Gateway IDs that was created in "public_subnets" module
az_ngw_ids = module.public_subnets.az_ngw_ids

Perhaps a depends_on argument would be good enough to add to module "private_subnets" ?

  depends_on = [module.public_subnets]

That should hopefully allow module.public_subnets to fully finish prior to applying module.private_subnets.

Edit: I tried it and it didn't work. Interesting issue.

@paulrob-100
Copy link
Contributor Author

paulrob-100 commented Apr 21, 2021

Thanks for your investigation @nitrocode.
I've found a number of similar issues reported in the terraform repo.
This one seems similar and wasn't fixed in the 0.13 series.
hashicorp/terraform#26078

However in this case I am using variables or derived local values which are known at plan time, as recommended.

@paulrob-100
Copy link
Contributor Author

Also found this other one hashicorp/terraform#26207

The module design seems very similar here with the root module passing the output of one module to another.

This one also fixed by the same merge hashicorp/terraform#27849

@Nuru
Copy link
Contributor

Nuru commented Apr 26, 2021

@paulrob-100 Thank you for your submission, we really appreciate it. However (@Gowiem please note) we have another submission, #47, which is a more complete solution, so we are going to focus on that.

I greatly appreciate that you added tests. For future reference (and although you would not know it from just looking at this module), we only run tests in us-east-2 for logistical reasons; please do not run tests elsewhere. We use a random number attribute so that we can run multiple test in the same region.

@paulrob-100
Copy link
Contributor Author

Thank you for your review and work on #47 @Nuru. Much appreciated.

I noticed that the test I added in this PR hasn't been added to the #47 merge.
I'll take on board your guidance on us-east-2 testing and resubmit if still valuable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Setting the nat_gateway_enabled=false results in Invalid index error
5 participants